Conversation

@yeshsurya (Contributor)

Summary

This PR adds support for GPT-OSS models.

Testing Done:

Inference benchmark execution (see the timing-loop sketch after this list):
  • Total inference runs: 18 executions
  • 12 successful benchmark measurements
  • 6 pre-run warmup iterations
  • Configurations tested: 3 scenarios with 2 runs each
  • All scenarios passed with consistent, measurable improvements
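
For illustration only, a minimal sketch of the warmup-then-measure pattern described above. This is not the actual benchmark script from the PR; `run_once` stands in for a single GPT-OSS inference call, and the run counts are passed in to match whatever scenario is being measured.

```python
# Illustrative timing harness, not the benchmark code used in this PR.
import time

import torch


def benchmark(run_once, num_warmup: int, num_measured: int) -> float:
    """Return the mean wall-clock time per measured run, in seconds."""
    # Warmup iterations exercise kernel compilation and caches;
    # they run but are excluded from the reported measurements.
    for _ in range(num_warmup):
        run_once()
    if torch.cuda.is_available():
        torch.cuda.synchronize()

    timings = []
    for _ in range(num_measured):
        start = time.perf_counter()
        run_once()
        if torch.cuda.is_available():
            torch.cuda.synchronize()  # wait for queued GPU work before stopping the clock
        timings.append(time.perf_counter() - start)
    return sum(timings) / len(timings)
```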

  • Hardware Type: RTX A6000
  • run `make test` to ensure correctness
  • run `make checkstyle` to ensure code style
  • run `make test-convergence` to ensure convergence

yeshsurya and others added 5 commits November 2, 2025 15:45
Added GPT-OSS to the supported models table in README.md with its supported operations (RoPE, RMSNorm, CrossEntropyLoss, FusedLinearCrossEntropy).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
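
For reviewers, a minimal usage sketch of what GPT-OSS support would look like from the user side, assuming this PR routes GPT-OSS through the existing `AutoLigerKernelForCausalLM` auto-patching entry point; the loading options below are illustrative and not taken from this PR.

```python
# Hypothetical usage sketch: load GPT-OSS with Liger's fused kernels
# (RoPE, RMSNorm, CrossEntropyLoss, FusedLinearCrossEntropy) applied.
# Assumes this PR registers GPT-OSS with the auto-patching entry point.
import torch
from transformers import AutoTokenizer

from liger_kernel.transformers import AutoLigerKernelForCausalLM

model_id = "openai/gpt-oss-20b"

# AutoLigerKernelForCausalLM patches supported architectures before loading;
# extra kwargs are forwarded to the underlying from_pretrained call.
model = AutoLigerKernelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # requires accelerate; illustrative only
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer("Hey, are you conscious?", return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```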
@yeshsurya yeshsurya changed the title Yeshwanth/gpt oss [feat]: Add support for gpt-oss Nov 22, 2025
@yeshsurya yeshsurya requested a review from kashif November 24, 2025 09:53
Comment on lines +45 to +62
Returns:

Example:

```python
>>> from transformers import AutoTokenizer, GptOssForCausalLM

>>> model = GptOssForCausalLM.from_pretrained("openai/gpt-oss-20b")
>>> tokenizer = AutoTokenizer.from_pretrained("openai/gpt-oss-20b")

>>> prompt = "Hey, are you conscious? Can you talk to me?"
>>> inputs = tokenizer(prompt, return_tensors="pt")

>>> # Generate
>>> generate_ids = model.generate(inputs.input_ids, max_length=30)
>>> tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]
"Hey, are you conscious? Can you talk to me?\nI'm not conscious, but I can talk to you."
```"""

this does not seem to match the output of this function
